“Data-Driven” Ontologies for an Information Extraction System from Polish Mammography Reports
Abstract
This paper describes the development of ontologies for an information extraction (IE) application for Polish mammography reports, the experience gained and lessons learned, and the evaluation of the system. Information extraction requires prior knowledge of the data structures to be identified. When the information sought is as complex as that contained in mammography reports, an approach grounded in predefined domain knowledge is required. Because our research goal is to extract as much of the precise information in mammography reports as possible, we chose an ontology as the platform for knowledge representation. Two ontologies were developed during this work: the first based mainly on BI-RADS [5], the second adjusted to the task of information extraction. The paper is structured as follows: Section 2 relates our experience with reusing existing ontologies; Sections 3 and 4 present, respectively, the initial mammographic ontology and the modified model adapted to the task of information extraction; Section 5 presents the IE system and the extraction results; and Section 6 concludes the paper.
Similar resources
Making Shallow Look Deeper: Anaphora and Comparisons in Medical Information Extraction
The paper focuses on resolving natural language issues that have been affecting the performance of our system for processing Polish medical data. In particular, we address phenomena such as ellipsis, anaphora, comparisons, coordination and negation occurring in mammogram reports. We propose practical data-driven solutions which allow us to improve the system’s performance.
Information Extraction from Patients' Free Form Documentation
The paper presents rule-based information extraction (IE) from two types of patients’ documentation in Polish. For both document types, values of sets of attributes were assigned using specially designed grammars. 1 Method/General Assumptions Various rule-based, statistical, and machine learning methods have been developed for the purpose of information extraction. Unfortunately, they have ...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Automatic abstraction of imaging observations with their characteristics from mammography reports
BACKGROUND: Radiology reports are usually narrative, unstructured text, a format which hinders the ability to input report contents into decision support systems. In addition, reports often describe multiple lesions, and it is challenging to automatically extract information on each lesion and its relationships to characteristics, anatomic locations, and other information that describes it. The ...
Annotation for Information Extraction from Mammography Reports
Inter- and intra-observer variability in mammographic interpretation is a challenging problem, and decision support systems (DSS) may help reduce variation in practice. Since radiology reports are created as unstructured text, natural language processing (NLP) techniques are needed to extract structured information from reports in order to provide the inputs to DSS. Before creat...